Smart Paper Notes

Automated Defect Recognition of Castings defects using Neural Networks

Alberto García-Pérez , María José Gómez-Silva , Arturo de la Escalera

Category: Computer Vision

2022-09-06

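Industrial X-ray analysis is common in the aerospace, automotive, and nuclear industries, where the structural integrity of certain parts must be guaranteed. However, interpreting radiographic images is sometimes difficult and can lead two experts to disagree on a defect classification. The Automated Defect Recognition (ADR) system presented in this paper will reduce analysis time and will also help reduce subjective interpretation of defects, while increasing the reliability of human inspectors. Our convolutional neural network (CNN) model achieves 94.2% accuracy (mAP@IoU = 50%) when applied to an automotive aluminium castings dataset (GDXray), which is considered similar to expected human performance and exceeds the current state of the art for this dataset. In an industrial environment, its inference time per DICOM image is low enough that it can be installed in production facilities without affecting delivery time. In addition, an ablation study of the main hyperparameters was carried out to optimize model accuracy from an initial baseline of 75% mAP up to 94.2% mAP.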
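The accuracy figure above is a mean average precision at an IoU threshold of 50%. As a minimal sketch of what that threshold means, assuming axis-aligned boxes in (x1, y1, x2, y2) form (the box format and the names `iou` and `is_match` are illustrative assumptions, not taken from the paper), a predicted defect box counts as a match when its overlap with the ground-truth box is at least 0.5:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """Hypothetical helper: a detection matches if IoU reaches the threshold."""
    return iou(pred_box, truth_box) >= threshold
```

Average precision is then computed per class from the matched and unmatched detections ranked by confidence, and mAP is the mean over classes; that part is omitted here.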

Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking

Elena Luna , Juan Carlos San Miguel , José María Martínez , Marcos Escudero-Viñolo

Category: Computer Vision

2022-11-28

This letter focuses on the task of Multi-Target Multi-Camera vehicle tracking. We propose to associate single-camera trajectories into multi-camera global trajectories by training a Graph Convolutional Network. Our approach processes all cameras simultaneously, providing a global solution, and it is also robust to large desynchronization between cameras. Furthermore, we design a new loss function to deal with class imbalance. Our proposal outperforms the related work, showing better generalization without requiring the ad-hoc manual annotations or thresholds that the compared approaches depend on.

Late multimodal fusion for image and audio music transcription

María Alfaro-Contreras , Jose J. Valero-Mas , José M. Iñesta , Jorge Calvo-Zaragoza

Category: Computer Vision

2022-04-06

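Music transcription, which involves converting music sources into a structured digital format, is a key problem in Music Information Retrieval (MIR). When addressing this challenge in computational terms, the MIR community follows two lines of research: music documents, which is the case of Optical Music Recognition (OMR), and audio recordings, which is the case of Automatic Music Transcription (AMT). The different nature of these input data has led the two fields to develop modality-specific frameworks. However, their recent formulation as sequence labeling tasks leads to a common output representation, which enables research on a combined paradigm. In this respect, multimodal image and audio music transcription poses the challenge of effectively combining the information conveyed by the image and audio modalities. In this work, we explore this question at the late-fusion level: we study four combination approaches in order to merge, for the first time, the hypotheses of end-to-end OMR and AMT systems in a lattice-based search space. The results obtained across a series of performance scenarios, in which the corresponding single-modality models yield different error rates, show interesting benefits of these approaches. In addition, two of the four strategies considered significantly improve on the corresponding unimodal standard recognition frameworks.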

Ontology-based Context Aware Recommender System Application for Tourism

Vitor T. Camacho , José Cruz

Category: Machine Learning

2022-12-29

In this work, a novel recommender system (RS) for tourism is presented. The RS is context-aware, as is now the rule in state-of-the-art recommender systems, and works on top of a tourism ontology that is used to group the different items being offered. The presented RS mixes different types of recommenders, creating an ensemble that changes with the RS's maturity: it starts from simple content-based recommendations and iteratively adds popularity, demographic, and collaborative filtering methods as rating density and user cardinality increase. The result is an RS that mutates during its lifetime and uses a tourism ontology and natural language processing (NLP) to correctly bin the items into specific item categories and meta-categories in the ontology. This item classification facilitates the association between user preferences and items and allows the items being offered to be better classified and grouped, which in turn is particularly useful for context-aware filtering.

Anomaly detection in laser-guided vehicles' batteries: a case study

Gianfranco Lombardo , Stefano Cagnoni , Stefano Cavalli , Juan José Contreras Gonzáles , Francesco Monica , Monica Mordonini , Michele Tomaiuolo

Category: Machine Learning

2022-12-27

Detecting anomalous data within time series is a very relevant task in pattern recognition and machine learning, with many possible applications that range from disease prevention in medicine (e.g., detecting early alterations of the health status before it can clearly be defined as "illness") to monitoring industrial plants. Regarding this latter application, detecting anomalies in an industrial plant's status firstly prevents serious damage that would require a long interruption of the production process. Secondly, it permits optimal scheduling of maintenance interventions by limiting them to urgent situations. By contrast, such interventions typically follow a fixed prudential schedule according to which components are substituted well before the end of their expected lifetime. This paper describes a case study on monitoring the status of Laser-Guided Vehicle (LGV) batteries, on which we worked as our contribution to project SUPER (Supercomputing Unified Platform, Emilia Romagna), aimed at establishing and demonstrating a regional High-Performance Computing platform that is going to represent the main Italian supercomputing environment for both computing power and data volume.

Scaling Painting Style Transfer

Bruno Galerne , Lara Raad , José Lezama , Jean-Michel Morel

Category: Computer Vision

2022-12-27

Neural style transfer is a deep learning technique that produces an unprecedentedly rich style transfer from a style image to a content image and is particularly impressive when it comes to transferring style from a painting to an image. It was originally achieved by solving an optimization problem to match the global style statistics of the style image while preserving the local geometric features of the content image. The two main drawbacks of this original approach are that it is computationally expensive and that the resolution of the output images is limited by high GPU memory requirements. Many solutions have been proposed to both accelerate neural style transfer and increase its resolution, but they all compromise the quality of the produced images. Indeed, transferring the style of a painting is a complex task involving features at different scales, from the color palette and compositional style to the fine brushstrokes and texture of the canvas. This paper provides a way to solve the original global optimization for ultra-high resolution images, enabling multiscale style transfer at unprecedented image sizes. This is achieved by spatially localizing the computation of each forward and backward pass through the VGG network. Extensive qualitative and quantitative comparisons show that our method produces a style transfer of unmatched quality for such high resolution painting styles.
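The "global style statistics" of the original optimization approach are commonly taken to be Gram matrices of network feature maps. The following is a minimal sketch under that assumption, with each channel flattened to a plain list of activations; the names `gram_matrix` and `style_loss` and the toy inputs are illustrative, and this is not the paper's spatially localized computation:

```python
def gram_matrix(features):
    """features: list of C channels, each a list of N activations.

    Returns the C x C matrix of channel inner products, divided by N.
    Summing over positions discards spatial layout, which is why the
    statistic is "global".
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) / n for ch_j in features]
        for ch_i in features
    ]


def style_loss(feats_a, feats_b):
    """Sum of squared differences between the two Gram matrices."""
    gram_a = gram_matrix(feats_a)
    gram_b = gram_matrix(feats_b)
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(gram_a, gram_b)
        for x, y in zip(row_a, row_b)
    )
```

Because the Gram matrix sums over every spatial position, all activations of a layer are needed to evaluate it, which is one way to see why memory use grows with image resolution.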

Structure-based drug discovery with deep learning

Rıza Özçelik , Derek van Tilborg , José Jiménez-Luna , Francesca Grisoni

Category: Machine Learning

2022-12-26

Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology, e.g., to predict protein structure and molecular bioactivity, plan organic synthesis, and design molecules de novo. While most of the deep learning efforts in drug discovery have focused on ligand-based approaches, structure-based drug discovery has the potential to tackle unsolved challenges, such as affinity prediction for unexplored protein targets, binding-mechanism elucidation, and the rationalization of related chemical kinetic properties. Advances in deep learning methodologies and the availability of accurate predictions for protein tertiary structure advocate for a renaissance in structure-based approaches for drug discovery guided by AI. This review summarizes the most prominent algorithmic concepts in structure-based deep learning for drug discovery, and forecasts opportunities, applications, and challenges ahead.

Forecasting through deep learning and modal decomposition in multi-phase concentric jets

León Mata , Rodrigo Abadía-Heredia , Manuel Lopez-Martin , José M. Pérez , Soledad Le Clainche

Category: Machine Learning

2022-12-24

This work presents a set of neural network (NN) models specifically designed for accurate and efficient fluid dynamics forecasting. We show how neural network training can be improved by reducing data complexity through a modal decomposition technique called higher order dynamic mode decomposition (HODMD), which identifies the main structures inside flow dynamics and reconstructs the original flow using only these main structures. This reconstruction has the same number of samples and spatial dimension as the original flow, but with less complex dynamics while preserving its main features. We also show the low computational cost required by the proposed NN models, both in their training and inference phases. The core idea of this work is to test the limits of applicability of deep learning models to data forecasting in complex fluid dynamics problems. The generalization capabilities of the models are demonstrated by using the same neural network architectures to forecast the future dynamics of four different multi-phase flows. The data sets used to train and test these deep learning models come from Direct Numerical Simulations (DNS) of these flows.

RouteNet-Fermi: Network Modeling with Graph Neural Networks

Miquel Ferriol-Galmés , Jordi Paillisse , José Suárez-Varela , Krzysztof Rusek , Shihan Xiao , Xiang Shi , Xiangle Cheng , Pere Barlet-Ros , Albert Cabellos-Aparicio

Category: Artificial Intelligence | Machine Learning

2022-12-22

Network models are an essential building block of modern networks. For example, they are widely used in network planning and optimization. However, as networks increase in scale and complexity, some models present limitations, such as the assumption of Markovian traffic in queuing theory models, or the high computational cost of network simulators. Recent advances in machine learning, such as Graph Neural Networks (GNN), are enabling a new generation of network models that are data-driven and can learn complex non-linear behaviors. In this paper, we present RouteNet-Fermi, a custom GNN model that shares the same goals as queuing theory, while being considerably more accurate in the presence of realistic traffic models. The proposed model accurately predicts the delay, jitter, and loss in networks. We have tested RouteNet-Fermi in networks of increasing size (up to 300 nodes), including samples with mixed traffic profiles (e.g., with complex non-Markovian models) and arbitrary routing and queue scheduling configurations. Our experimental results show that RouteNet-Fermi achieves accuracy similar to that of computationally expensive packet-level simulators and that it scales accurately to large networks. For example, the model produces delay estimates with a mean relative error of 6.24% when applied to a test dataset with 1,000 samples, including network topologies one order of magnitude larger than those seen during training.
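The mean relative error quoted above can be illustrated with a short sketch. It assumes the metric is the average of |predicted - actual| / |actual| over samples; the abstract does not spell out the exact definition, and the function name and delay values below are made up for illustration:

```python
def mean_relative_error(predicted, actual):
    """Average of |p - a| / |a| over paired samples (assumed definition)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical per-path delays in seconds: model estimates vs. simulator values.
estimated = [1.1, 1.8]
simulated = [1.0, 2.0]
error = mean_relative_error(estimated, simulated)  # roughly 0.10, i.e. 10%
```

A relative metric weights each path by its own delay, so short-delay and long-delay paths contribute on the same scale.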

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

Ioannis Tsiamas , José A. R. Fonollosa , Marta R. Costa-jussà

Category: Natural Language Processing

2022-12-19

Data scarcity is one of the main issues with the end-to-end approach to Speech Translation, as compared to the cascaded one. Although most data resources for Speech Translation are originally document-level, they offer a sentence-level view, which can be used directly during training. But this sentence-level view is single and static, potentially limiting the utility of the data. Our proposed data augmentation method, SegAugment, challenges this idea and aims to increase data availability by providing multiple alternative sentence-level views of a dataset. Our method relies heavily on an Audio Segmentation system to re-segment the speech of each document, after which we obtain the target text with alignment methods. The Audio Segmentation system can be parameterized with different length constraints, thus giving us access to multiple and diverse sentence-level views for each document. Experiments on MuST-C show consistent gains across 8 language pairs, with an average increase of 2.2 BLEU points, and up to 4.7 BLEU for lower-resource scenarios in mTEDx. Additionally, we find that SegAugment is also applicable to purely sentence-level data, as in CoVoST, and that it enables Speech Translation models to completely close the gap between the gold and automatic segmentation at inference time.
